Amoszine 8

home *** CD-ROM | disk | FTP | other *** search

/ Amoszine 8 / Amoszine 8 (Disk 1 of 3).adf / ARTS / AS_Steve_Prob.asc / AS_Steve_Prob.asc

Wrap

Text File | 1998-04-26 | 22.0 KB | 447 lines

@2{USING LINKED LISTS IN AMOS @3 }By Andy (I'm Gonna Solve It) Smith @5 Solving Steve 'Lamer' Bye's directory reading problem ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @1 Originally, I was going to write articles on Linked Lists, and using the system libraries. However, when I saw Steve Bye's plea for help with a directory reader I thought,'Hey! I can write an article on that and include the articles I was going to write.' So that's what this article is about. @3 Right, down to business. @6 Linked Lists ~~~~~~~~~~~~ @4 Steve's problem is quite a difficult problem to solve at first glance due to it's complexity and size. However, all things in programming may appear difficult at first but when the problem is broken down into managable chunks, it suddenly becomes easy. The first hurdle to jump over is how to store the filenames and directory names. @4 Most people will opt for arrays, because they know how they work and they are easy to implement. But stop to think for one moment. @5 Do you know many directories and files there are? I can tell you now that you don't, without a doubt. So the problem this presents is how many elements to allocate in the array? 10? 50? 5,000,000? You could define an array to hold 500 elements. But what happens if there are more files? Your program crashes, thats what. Even if it is enough the other problem is that lots of memory is being wasted, due to the elements of the array not being used. From this, you just might have guessed that arrays are not suitable data structures to use for storing filenames, and you would be correct in your assumption. @3 The most suitable data structure to use is the Linked List for reasons which will become apparent in a while. In fact Linked Lists have been covered before in Amoszine, Issue 6 to be exact, written by none other than Thomas Lancaster. Unfortunately, Thomas missed the point of Linked Lists, I hope my explanation of them clarifies things. @7 The differences between arrays and Linked Lists ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @4 The difference between arrays and linked lists is that the array is a 'static data structure'. This means that once it is defined, its size is fixed to how many elements you specified. It cannot shrink or expand in size, so if you fill your array, then tough! @5 A Linked List however is a 'dynamic data structure' which means its size is not predefined and it can expand and shrink when it takes your fancy. @3 The elements of an array are stored consecutively in memory as the diagram below shows: @1 _____ _____ _____ _____ _____ _____ _____ _____ | | | | | | | | | Values | 23 | 2 | 13 | 44 | 76 | 1 | 0 | 99 | |_____|_____|_____|_____|_____|_____|_____|_____| Memory Location 1024 1028 1032 1036 1040 1044 1048 1052 @4 You may have noticed that the memory locations are incremented in steps of 4. Why is this? Because when you define an array to hold numbers, it takes four bytes. This allows truly massive numbers to be stored. @3 The diagram of a linked list is very different, here it is: @1 _____ _____ _____ _____ _____ _____ | | | | | | | | | | 23 | ------->| 2 | ------->| 13 | NULL| |_____|_____| |_____|_____| |_____|_____| ^ ^ ^ ^ ^ ^ | | | | | | Value | Value | Value | | | | Pointer Pointer Pointer @4 In the diagram above, an element is the rectangle bit divided into two squares. The left square is your actual data you want to store, whether it is a number or a string it doesn't really matter. @5 The square on the right of the element is whats known as a 'pointer'. A pointer is a variable like any other, except it stores addresses. In AMOS there is no 'pointer type' so we just use an ordinary integer variable. This pointer contains the address of the next element in the list. It is important to note however that the elements are NOT stored consecutively in memory, like arrays. The elements of a linked list can be ANYWHERE in memory. This is why a pointer is needed. Notice the arrows from the right of each element pointing to the left hand side of the next element. This shows you that each element points to (contains the address of) the next element. Also notice that the last element's pointer contains NULL. This is because we don't want it to point anywhere as it is the last element. Therefore we assign the pointer the value of NULL, which is zero in most cases. @3 IMPORTANT NOTE: In arrays, you can access an element directly. So if you wanted to access the third element of an array called HELLO, you would use HELLO(2) (don't forget that arrays start from 0!!). In linked lists however, you can't. If you want a particular element from the list, you have to start from the beginning of the list and go through it until you find the desired item. @4 One last thing to note about linked lists, each element can contain 2 pointers if you wish. The address of the previous element, and the address of the next element. This is called a 2-way list, we have utilised 1-way lists only. @1 Back to Thomas Lancaster's article on Linked lists, he says you can use arrays to create linked lists. Wrong. The fact that an array's size is fixed means that you just can't use an array to implement a linked list, it defeats the whole object of using a linked list in the first place! A linked list changes its size remember! In fact Thomas says that using an array to implement a linked list "means that a maximum of 10 pieces of data can be retained". This is exactly why you cannot use arrays, it is restricting. @5 Further on, he mentions another method which he says is more efficient. His idea is to use disk files to store the pointers. How can it be more efficient if it is a file on a disk? Disk drives are many more times slower than memory. Also, you can only open a maximum of 10 files which means you are restricted again. I hope that clarifies things for Thomas and other people who don't quite understand the theory of linked lists. @3 Anyway, I had to get that off my chest, you wouldn't want people to think that Amoszine spreads misinformation! @7 Implementing linked lists in AMOS ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @4 Blitz Basic 2 programmers think AMOS it not capable of very much. Not true, AMOS can do anything any other language can do. Perhaps not as effectively, but it can be done. I can't understand why no-one uses linked lists in AMOS because it is really simple. It's almost as simple as C. The major thing missing from AMOS are 'structures'. A structure in C is just another name for a record. In C, we need to utilise structures to be able to use linked lists. A suitable structure for a linked list could be as follows: @1 struct { char filename[80]; struct node *next; } node; @5 That chunk of code reserves space for a structure (record) called node. The declarations in the braces are the structures fields. The 'char filename[80]' line reserves an array of characters (a string basically) called filename which can hold 80 characters. @4 The next declaration 'struct node *next' declares a pointer called 'next' which contains an address. It is this address that points to the next element. That structure is in actual fact one element of the linked list. Refer back to the diagram of a linked list. @3 As I said, structures are not implemented in AMOS, so this presents a problem. When structures are defined all the elements in that structure are stored one after the other in memory. So we need to replicate this in AMOS, which is quite easy. All we need to do is reserve a wad of memory but you need to decide how much to reserve. This is quite easy. Look at the structure above. @5 First we defined a string to hold 80 characters right? So to start with, we know we need to reserve as least 80 bytes. Next we need to store the pointer (the address of the next element). @4 Addresses are stored as long words (4 bytes) so we need 4 bytes for that. So the total size of our structure is 84 bytes long. @3 Unfortunately, AMOS presents us with yet another problem. The only way in AMOS to reserve memory is to use a 'Reserve As.....' instruction. Example: @1 Reserve as Work 15,84 @5 That command would reserve a memory bank 84 bytes long. It would be allocated using slot 15 of the memory banks. Can you see what the problem is yet?? @4 Well, in the original AMOS, you could only reserve banks 1 to 15, so we could only store 15 filenames! In AMOS Pro however, the memory bank system was vastly overhauled and you could reserve up to 65535 banks. But we are still constrained by what we can reserve. I also bet that there are more than 65535 files on someones hard drive (including sub-directories of course). Also, if we did use memory banks, we would need to increment the slot number (the first parameter in the Reserve As ..... instruction) each time we wanted to reserve memory. The solution then is to go down to operating system level and use the functions in the Exec library. AAAArrrrrrrgggggggghhhhhhhhh, I hear you all scream! Don't worry, it's really easy. The function we need to use is called AllocMem() and strangely enough it reserves some memory for us. I won't go into the mechanics of calling system libraries here as this article will get too long. If you want to learn how to get your fingers dirty, then read my other article in this issue on how to do such things! All you need to know now though is that we will be using this function. @3 Developing the Algorithm ~~~~~~~~~~~~~~~~~~~~~~~~@4 We now know basically what we are doing. However, it isn't time to jump to your keyboards and start coding. You need to design it on paper first. This is VERY important. When you are developing algorithms you are not quite sure about you should ALWAYS design it first and dry run it to save you time and effort. Many a time I have jumped in head first and came out with no results, it is frustrating I can assure you. So here is my pseudo-code for the linked list program: @1 get the first filename; allocate some memory enough for a structure and remember the start address; REPEAT store filename in structure; allocate some memory enough for a structure and remember the start address; get start address and store in pointer element of previous structure; get next filename; UNTIL NO MORE FILES; store last filename in structure; put NULL value into this elements pointer; @4 That pseudo-code will read in a list of filenames and store each one in its own structure. Time to refine it more. This time our pseudo-code contains english and AMOS statements. @1 ' Reserve 42 bytes of memory and store start address in ' LIST LIST=AllocMem(42); ' Remember the start address of our list, store in CURRENT CURRENT=LIST; Get first filename; REPEAT ' Store filename in memory area just reserved populate_structure(CURRENT,filename); ' Reserve 42 bytes of memory and store start address in ' NEWLOCATION NEWLOCATION=AllocMem(42); ' Now this bit stores the address of our newly reserved ' memory area in the memory area we reserved previously. ' Loke is an AMOS command that stores a 4-byte number in ' memory you specify. This command has the effect of ' storing the address of the next element in the current ' element. Loke CURRENT+38,NEWLOCATION; ' We now move onto the next element, which we just ' reserved. The CURRENT variable contains the address of ' the element we are currently accessing. CURRENT=NEWLOCATION; get next filename; UNTIL NO MORE FILES; ' Store blank filename in last element populate_structure(CURRENT,filename); ' Store NULL value in pointer field of structure. Loke CURRENT+38,0 @5 You may be wondering why I have reserved 42 bytes of memory. This is because the Dir First$ and Dir Next$ functions return a string of 38 characters, plus we need a pointer as well, another 4 bytes, so that makes 42 bytes in total. @4 The pseudo-code above was very close to the AMOS equivalent so now I present the AMOS version of the linked list program. It is also on the disk, it is called Linked_List.AMOS. @1 ' Reserve 42 bytes of memory and store start address in ' LIST _ALLOC_MEM@1[@142,0] LIST=Param ' Remember the start address of our list, store in CURRENT CURRENT=LIST ' Get first filename FILE$=Dir First$("**") Repeat ' Store filename in memory area just reserved _POPULATE[CURRENT,FILE$] ' Reserve 42 bytes of memory and store start address in ' NEWLOCATION _ALLOC_MEM@1[@142,0] NEWLOCATION=Param ' Now this bit stores the address of our newly reserved ' memory area in the memory area we reserved previously. ' Loke is an AMOS command that stores a 4-byte number in ' memory you specify. This command has the effect of ' storing the address of the next element in the current ' element. Loke CURRENT+38,NEWLOCATION ' We now move onto the next element, which we just ' reserved. The CURRENT variable contains the address of ' the element we are currently accessing. CURRENT=NEWLOCATION ' Get next filename FILE$=Dir Next$ Until FILE$="" _POPULATE[CURRENT,FILE$] Loke CURRENT+38,0 Procedure _POPULATE[CURRENT,LINE$] Procedure _ALLOC_MEM[SIZE,REQUIREMENTS] @4 If you compare this AMOS program and the pseudo-code that preceeded it you will notice they are very similar. Load AMOS now and load the file 'Linked_List.AMOS' into the editor. Try it out and you will learn! Notice that in the source code that we have a few more procedures. Namely _FREE_LIST and _FREE_MEM. You should call the _FREE_LIST procedure when you have finished with your list, otherwise it will take up valuable memory. The _FREE_MEM procedure is called from inside the _FREE_LIST procedure and basically it deletes one element from our list. @5 Believe it or not, but we have jumped a major hurdle in Steve's problem. We can store as many filenames as we like without restrictions. All we need to do now is to get the entire directory structure of a disk! @3 Recursion recursion recursion recursion........... ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~@4 Now we have linked lists out of the way, we can now deal with scanning all files of a disk. The method we are going to use is called recursion. Recursion is where a procedure calls itself based on some condition. The reason we are using recursion to scan directories and subdirectories is because it is much simpler to code. The following pseudo- code will scan a disk recursively: @1 PROCEDURE _SCAN_DISK[filename] scan directory for first file repeat if item scanned = directory THEN _SCAN_DISK[item scanned] ELSE print filename END IF scan directory for next file until NO MORE FILES END PROC @5 The above pseudo-code fragment loops until it can't find any more files. While it is scanning, if it comes across a file, then it prints it out, with the full path attached. However, if it finds a directory then it calls itself i.e. the _SCAN_DISK procedure is called from within itself. This is what enables it to scan directories. When there are no more files in the directory, then the procedure ends and control returns to the previous procedure. As good as recursion may sound, it does have drawbacks. Firstly, if recursive procedures are not controlled, the stack could overflow, resulting in your program crashing. You can change the size of the stack by including a Set Stack instruction in your programs. It requires one parameter, the number of recursive calls allowed. It is unlikely that anybody has more than 10 nested directories so there is no real problem. @4 When you call a procedure, or subroutine, the address of the Program Counter is pushed onto the stack. When the procedure terminates, the address is popped off the stack. However, when using recursion, addresses keep getting pushed onto the stack and if there isn't enough room, it overflows. @3 There is another drawback to using recursion, that is AMOS itself. AMOS isn't too friendly with recursion at all, it can crash, and sometimes programs that use recursion don't compile properly. It makes you think sometimes if François Lionet can actually program properly. Included on this disk (if Andy Gibson remembers to include my source code) is the source code for our directory reading program, it is called Dirstruc.AMOS. Load it into the editor and run it. When it prompts you for a directory to scan, choose DF0:. @5 Don't scan your hard drive as it crashes, this is due to AMOS again. It should display ALL your files on your floppy disk. If you have a look at the code, you will see I'm not using the Dir First$ and Dir Next$ commands to scan a directory. Why? Because they are fraught with problems. Firstly, it insists on tagging the filesize onto the returned string. Secondly, it doesn't work with recursion. So I am using functions from the dos.library. Don't worry yourself too much at this stage on how it works, just accept that it does! If you are REALLY interested in how it all works, read my tutorial on calling libraries. @4 Steve wants the directory structure to be sent to a file. This could easily be done by using Print# statements instead of using Print statements. The trouble with this method is that the drive heads will thrash. By this I mean whilst it is reading a directory, it will have to write to a file somewhere else, which will inevitably slow the program down a great deal. This is where our theory on linked lists comes in. Instead of writing the filenames to the screen or a file, we shall store it in a linked list. When the entire disk has been scanned, we simply scan from the start of the list to the end, reading each filename from it and writing it into a file. Load the file DirStruc2.AMOS into the editor and look at the source code. Then try it out. But please note, it is VERY VERY dodgy, and there's nothing I can do to rectify it. @3 AMOS has a lot to answer for sometimes. I have also included a C program called DirStruc.C which does the same as the AMOS program, but more efficiently and it doesn't crash. If you have DICE it will compile under that. If you don't, then there's an executable version that you can use from the shell. It may be worth noting that you could easily store the executable version in a memory bank and run it, solving our crashing problem. It is a quick and dirty method of doing it, but AMOS leaves us with no choice! The executable version is called 'dirstruc' funnily enough. It needs to be run from the shell. @4 Well that's it for this article, I hope you found it informative. If there is something you don't quite understand, contact me and I'll try to help you. If you want to see how to use library functions, then read my other article on this subject! Bye for now. @1 end.